-
Notifications
You must be signed in to change notification settings - Fork 792
[inlining] increasing accuracy of the code size assessment model #7648
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Interesting idea. Do you have measurements on some benchmark perhaps? I'm not sure I have a good guess about this without data. |
I use assemblyscript bootstrap as bench
The store will use at least 3 bytes which is 1 byte larger than call, so inline it will increase 1 byte in WASM size. But it is depends on the frequency... But if we only keep the normal algorithm operation, It is definitely profitable. |
And I think it is worth to inline if we want a balance in both shrink level and optimize level. It will not increase code size much and inlining can reduce the cost of function call and introduce more optimization opportunity later. |
I just realized that I'm fixing the same issue with #7669 and #7670. (note: #7670 is based on #7669, it includes changes in #7669) My fix is different: I don't change code size measurements, instead improve the existing "trivial call" heuristic. @HerrCai0907 it would be interesting to see how much different #7670 makes on the same benchmark you run above, and whether it optimizes the same way. (Note: I need to rebase #7670, which I'll do now.) |
I think the PRs linked above should be "safer" in the sense that they won't cause accidentally inlining too much, because they don't change size calculations. But even with those merged I think this PR may make sense. Assuming we merge my PRs, I would make these changes in this PR:
(If we see a non-constant instruction between two So in a function like:
We count This can potentially cover code like:
This kind of thing can be generated when at the source code you have default arguments which some call sites omit, and the default value is a constant integer like 0 above. In other words, consider "size" as the worst case code size contribution to call sites, rather than the size of the function being inlined. (We could even consider the number of extra locals required when the function locals are not used in order, but I'm not bright enough to immediately see what kind of algorithm would be needed here.. and we're already probably getting very little benefit with all of this after the other two PRs linked above) |
Old code size measure is simply based on operation count in callee. It will miss the opportunity like
which is always happened for variable setter in OO language.
In this PR, we want to introduce inlining specific measurer to ignore this pattern for better inlining result.
The new measurer will ignore all occurrences of
local.get
that occur in sequence.The behavior of the new measurer will be consistent with that of the old measurer in:
global.get
orxxx.const
local.set
In a word, the new measure want to match this pattern: The function body only contain
local.get $param_n
and trivial consumer operation likexxx.add
xxx.store
if
etc.